A Probabilistic Lexical Approach to Textual Entailment
Authors
Abstract
The textual entailment problem is to determine if a given text entails a given hypothesis. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed from the text. This problem is recast as one of text categorization in which the classes are the vocabulary words. We make novel use of Naïve Bayes to model the problem in an entirely unsupervised fashion. Empirical tests suggest that the method is effective and compares favorably with state-of-the-art heuristic scoring approaches.

Many Natural Language Processing (NLP) applications need to recognize when the meaning of one text can be expressed by, or inferred from, another text. Information Retrieval (IR), Question Answering (QA), Information Extraction (IE) and text summarization are examples of applications that need to assess such semantic overlap between text segments. Textual Entailment Recognition has recently been proposed as an application independent task to capture such semantic inferences and variability [Dagan et al., 2005]. A text t textually entails a hypothesis h if t implies the truth of h. Textual entailment captures generically a broad range of inferences that are relevant for multiple applications. For example, a QA system has to identify texts that entail the expected answer. Given the question "Where was Harry Reasoner born?", a text that includes the sentence "Harry Reasoner's birthplace is Iowa" entails the expected answer form "Harry Reasoner was born in Iowa." In many cases, though, entailment inference is uncertain and has a probabilistic nature. For example, a text that includes the sentence "Harry Reasoner is returning to his Iowa hometown to get married." does not deterministically entail the above answer form. Yet, it is clear that it does add substantial information about the correctness of the hypothesized assertion.

A Probabilistic Setting

We propose a general generative probabilistic setting for textual entailment. We assume that a language source generates texts within the context of some state of affairs. Thus, texts are generated along with hidden truth assignments to hypotheses. We define two types of events over the corresponding probability space: I) For a hypothesis h, we denote as Tr_h the random variable whose value is the truth value assigned to h in the world of the generated text. Correspondingly, Tr_h = 1 is the event of h being assigned a truth value of 1 (True). II) For …
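To make the abstract's recasting of lexical entailment as text categorization more concrete, the following Python snippet is a minimal, illustrative sketch rather than the authors' implementation. It assumes a toy corpus, naive whitespace tokenization, binary bag-of-words features, and Laplace smoothing; each hypothesis word is treated as a "class" whose noisy positive examples are the documents containing that word, a Naïve Bayes classifier estimates P(Tr_u = 1 | text) for that word, and the per-word estimates are multiplied into a hypothesis-level score.

```python
# Illustrative sketch only (not the authors' implementation) of unsupervised
# Naive Bayes lexical entailment: each hypothesis word u is a class, documents
# containing u serve as noisy positive examples, and P(Tr_u = 1 | text) is
# estimated with binary word features and Laplace smoothing.

import math


def tokenize(s):
    # naive whitespace tokenization with light punctuation stripping
    return [t for t in (w.lower().strip('.,?!"\'') for w in s.split()) if t]


class LexicalEntailmentNB:
    def __init__(self, corpus, alpha=1.0):
        self.docs = [set(tokenize(d)) for d in corpus]  # binary bag-of-words
        self.alpha = alpha                              # Laplace smoothing

    def word_entailment_prob(self, u, text_tokens):
        """Estimate P(Tr_u = 1 | text) with Naive Bayes, using documents that
        contain u as (noisy) positive examples and the rest as negatives."""
        pos = [d for d in self.docs if u in d]
        neg = [d for d in self.docs if u not in d]
        if not pos:  # unseen hypothesis word: back off to a small prior value
            return self.alpha / (len(self.docs) + 2 * self.alpha)
        prior_pos = (len(pos) + self.alpha) / (len(self.docs) + 2 * self.alpha)
        log_pos, log_neg = math.log(prior_pos), math.log(1.0 - prior_pos)
        for w in set(text_tokens):
            p_w_pos = (sum(w in d for d in pos) + self.alpha) / (len(pos) + 2 * self.alpha)
            p_w_neg = (sum(w in d for d in neg) + self.alpha) / (len(neg) + 2 * self.alpha)
            log_pos += math.log(p_w_pos)
            log_neg += math.log(p_w_neg)
        # turn the two joint log-scores into a posterior for Tr_u = 1
        return 1.0 / (1.0 + math.exp(log_neg - log_pos))

    def entailment_score(self, text, hypothesis):
        """Score P(Tr_h = 1 | t) as the product of per-word estimates;
        in practice one would threshold or rank these scores."""
        t_tokens = tokenize(text)
        score = 1.0
        for u in set(tokenize(hypothesis)):
            score *= self.word_entailment_prob(u, t_tokens)
        return score


# toy usage with an illustrative mini-corpus
corpus = [
    "Harry Reasoner was born in Iowa .",
    "Reasoner is returning to his Iowa hometown to get married .",
    "The senator was born in Ohio and raised in Iowa .",
]
model = LexicalEntailmentNB(corpus)
print(model.entailment_score("Harry Reasoner's birthplace is Iowa",
                             "Harry Reasoner was born in Iowa"))
```

The absolute value of the score matters less than its relative magnitude across candidate texts; in a real system the classifier would be trained over a large document collection rather than a three-document toy corpus.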
Similar papers
A Probabilistic Setting and Lexical Cooccurrence Model for Textual Entailment
This paper proposes a general probabilistic setting that formalizes a probabilistic notion of textual entailment. We further describe a particular preliminary model for lexical-level entailment, based on document cooccurrence probabilities, which follows the general setting. The model was evaluated on two application independent datasets, suggesting the relevance of such probabilistic approache...
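Since this abstract is truncated, the exact co-occurrence model is not shown here; the sketch below is one plausible instantiation of a document co-occurrence estimate, not necessarily the cited paper's model. It assumes P(u | v) is approximated by the fraction of documents containing v that also contain u, each hypothesis word is scored by its best-supporting text word, and the per-word scores are multiplied.

```python
# Illustrative sketch only: a document co-occurrence estimate of lexical
# entailment. P(u | v) ~ n(u, v) / n(v) over a document collection; each
# hypothesis word takes the max over text words; scores are multiplied.

def cooccurrence_entailment(corpus_docs, text, hypothesis):
    docs = [set(d.lower().split()) for d in corpus_docs]  # naive tokenization

    def p_u_given_v(u, v):
        n_v = sum(v in d for d in docs)
        n_uv = sum(u in d and v in d for d in docs)
        return n_uv / n_v if n_v else 0.0

    t_words = set(text.lower().split())
    score = 1.0
    for u in set(hypothesis.lower().split()):
        score *= max((p_u_given_v(u, v) for v in t_words), default=0.0)
    return score
```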
A Lexical Alignment Model for Probabilistic Textual Entailment
This paper describes the Bar-Ilan system participating in the Recognising Textual Entailment Challenge. The paper proposes first a general probabilistic setting that formalizes the notion of textual entailment. We then describe a concrete alignment-based model for lexical entailment, which utilizes web co-occurrence statistics in a bag of words representation. Finally, we report the results of ...
A Probabilistic Classification Approach for Lexical Textual Entailment
The textual entailment task – determining if a given text entails a given hypothesis – provides an abstraction of applied semantic inference. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed from the text. This problem is recast as one of ...
Web Based Probabilistic Textual Entailment
This paper proposes a general probabilistic setting that formalizes the notion of textual entailment. In addition we describe a concrete model for lexical entailment based on web co-occurrence statistics in a bag of words representation.
Towards a Probabilistic Model for Lexical Entailment
While modeling entailment at the lexical level is a prominent task, addressed by most textual entailment systems, it has been approached mostly by heuristic methods, neglecting some of its important aspects. We present a probabilistic approach for this task which covers aspects such as differentiating various resources by their reliability levels, considering the length of the entailed sentence...